ProtoNet: Navigating the Hierarchical Clustering of the Protein Space
Abstract
The ProtoNet site provides an automatic hierarchical clustering of the protein space. The clustering is based on an all-against-all BLAST similarity test. Using this similarity measure, we perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this process is a classification of the input proteins into a hierarchy of clusters of varying degrees of granularity. This clustering can be used for function prediction, for defining superfamilies and subfamilies, and for large-scale protein annotation. ProtoNet is accessible as an interactive Web site at http://www.protonet.cs.huji.ac.il.
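The bottom-up process described in the abstract can be illustrated with a minimal Python sketch. This is not the ProtoNet implementation: the function names, the toy scores, the `default` score for pairs without a BLAST hit, and the two merging rules shown (arithmetic and geometric averaging of pairwise scores) are assumptions chosen for illustration. Scores are treated as E-value-like dissimilarities, so lower means more similar.

```python
from itertools import combinations
from math import prod  # requires Python 3.8+


def arithmetic_mean(scores):
    return sum(scores) / len(scores)


def geometric_mean(scores):
    return prod(scores) ** (1.0 / len(scores))


def agglomerate(scores, items, merge_rule=arithmetic_mean, default=10.0):
    """Greedy bottom-up clustering over pairwise scores (lower = more similar).

    `scores` maps unordered protein pairs to a score; pairs that are absent
    get `default` (an assumed stand-in for "no hit found").
    Returns the merges in the order they happened: (cluster_a, cluster_b, score).
    """
    def pair_score(a, b):
        return scores.get((a, b), scores.get((b, a), default))

    # Start with every protein in its own singleton cluster.
    clusters = [frozenset([x]) for x in items]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Score a candidate merge by averaging all cross-cluster pairs.
            s = merge_rule([pair_score(a, b) for a in c1 for b in c2])
            if best is None or s < best[0]:
                best = (s, c1, c2)
        s, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 | c2)
        merges.append((c1, c2, s))
    return merges


# Hypothetical toy input: four proteins and three pairwise hits.
toy_scores = {("P1", "P2"): 1e-40, ("P3", "P4"): 1e-20, ("P1", "P3"): 1e-3}
proteins = ["P1", "P2", "P3", "P4"]

for rule in (arithmetic_mean, geometric_mean):
    print(rule.__name__)
    for a, b, s in agglomerate(toy_scores, proteins, merge_rule=rule):
        print("  merge", sorted(a), "+", sorted(b), "at score", s)
```

Each merge creates a new node above the two clusters it joins, so the recorded merge order defines the hierarchy, and cutting it at different scores yields clusters of different granularity. Swapping the `merge_rule` argument is the hook for trying an alternative merging rule on the same similarity data.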
Similar resources
ProtoNet: hierarchical classification of the protein space
The ProtoNet site provides an automatic hierarchical clustering of the SWISS-PROT protein database. The clustering is based on an all-against-all BLAST similarity search. The similarities' E-score is used to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a...
ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree
ProtoNet 6.0 (http://www.protonet.cs.huji.ac.il) is a data structure of protein families that cover the protein sequence space. These families are generated through an unsupervised bottom-up clustering algorithm. This algorithm organizes large sets of proteins in a hierarchical tree that yields high-quality protein families. The 2012 ProtoNet (Version 6.0) tree includes over 9 million proteins ...
ProtoNet 4.0: A hierarchical classification of one million protein sequences
ProtoNet is an automatic hierarchical classification of the protein sequence space. In 2004, the ProtoNet (version 4.0) presents the analysis of over one million proteins merged from SwissProt and TrEMBL databases. In addition to rich visualization and analysis tools to navigate the clustering hierarchy, we incorporated several improvements that allow a simplified view of the scaffold of the pr...
Entropy-driven partitioning of the hierarchical protein space
MOTIVATION Modern protein sequencing techniques have led to the determination of >50 million protein sequences. ProtoNet is a clustering system that provides a continuous hierarchical agglomerative clustering tree for all proteins. While ProtoNet performs unsupervised classification of all included proteins, finding an optimal level of granularity for the purpose of focusing on protein function...
Fishing With (Proto)Net—A Principled Approach to Protein Target Selection
Structural genomics strives to represent the entire protein space. The first step towards achieving this goal is by rationally selecting proteins whose structures have not been determined, but that represent an as yet unknown structural superfamily or fold. Once such a structure is solved, it can be used as a template for modelling homologous proteins. This will aid in unveiling the structural ...
Publication date: 2006